Microbiome Data Visualization

Get a public 16S-demo dataset

Using the dietswap dataset from the microbiome package

phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 130 taxa and 222 samples ]
sample_data() Sample Data:       [ 222 samples by 8 sample variables ]
tax_table()   Taxonomy Table:    [ 130 taxa by 3 taxonomic ranks ]

Explore variables and missing values

# A tibble: 14 × 4
   variable               type      unique p_zeros
   <chr>                  <chr>      <int>   <dbl>
 1 sample_id              character  28860     0  
 2 otu                    character    130     0  
 3 sample                 character    222     0  
 4 abundance              numeric     1268    20.6
 5 subject                factor        38     0  
 6 sex                    factor         2     0  
 7 nationality            factor         2     0  
 8 group                  factor         3     0  
 9 timepoint              integer        6     0  
10 timepoint.within.group integer        2     0  
11 bmi_group              factor         3     0  
12 phylum                 character      8     0  
13 family                 character     22     0  
14 genus                  character    130     0  
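
The profile above reports, for each column, its type, the number of distinct values, and the percentage of zeros. As a rough illustration of how such a summary can be derived, here is a minimal sketch in Python using only the standard library (the report itself was produced in R). The function name `profile_columns` and the toy records are hypothetical and are not taken from dietswap.

```python
def profile_columns(rows):
    """Summarise each column of a list of dict records.

    Returns one dict per column with its value type, number of
    distinct values, and percentage of entries equal to zero.
    """
    if not rows:
        return []
    summary = []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # Only numeric entries can count as zeros; bool is excluded on purpose.
        zeros = sum(
            1 for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool) and v == 0
        )
        summary.append({
            "variable": name,
            "type": type(values[0]).__name__,
            "unique": len(set(values)),
            "p_zeros": round(100 * zeros / len(values), 2),
        })
    return summary


# Toy long-format records (hypothetical values, not taken from dietswap).
toy_rows = [
    {"sample": "s1", "otu": "A", "abundance": 0},
    {"sample": "s1", "otu": "B", "abundance": 12},
    {"sample": "s2", "otu": "A", "abundance": 5},
    {"sample": "s2", "otu": "B", "abundance": 0},
    {"sample": "s3", "otu": "A", "abundance": 7},
]

for entry in profile_columns(toy_rows):
    print(entry)
```

A high `p_zeros` on the abundance column is expected for long-format microbiome tables, since many taxa are absent from many samples.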

Frequency distributions for categorical variables

     sex frequency percentage cumulative_perc
1   male     15600      54.05           54.05
2 female     13260      45.95          100.00

  nationality frequency percentage cumulative_perc
1         AAM     15990      55.41           55.41
2         AFR     12870      44.59          100.00

  group frequency percentage cumulative_perc
1    ED      9750      33.78           33.78
2    HE      9750      33.78           67.56
3    DI      9360      32.43          100.00

   bmi_group frequency percentage cumulative_perc
1      obese     11700      40.54           40.54
2 overweight      9880      34.23           74.77
3       lean      7280      25.23          100.00
[1] "Variables processed: sex, nationality, group, bmi_group"
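
The columns in the tables above (frequency, percentage, cumulative percentage) can be reproduced with a few lines of code. The following is a minimal Python sketch, not the R code used for the report; `frequency_table` is a hypothetical helper name. It accumulates exact counts before rounding, so its cumulative column may differ in the last digit from a table that sums already-rounded percentages.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (category, frequency, percentage, cumulative_perc),
    sorted by descending frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, freq in counts.most_common():
        running += freq
        rows.append((
            category,
            freq,
            round(100 * freq / total, 2),
            round(100 * running / total, 2),
        ))
    return rows


# Rebuild the sex column from the counts reported above.
sex = ["male"] * 15600 + ["female"] * 13260

for row in frequency_table(sex):
    print(row)
```

With the counts from the sex table, the sketch gives 54.05 % for male and 45.95 % for female, matching the reported percentages.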

Correlation of numeric variables

   Variable   sex
1       sex  1.00
2 abundance  0.00
3 timepoint -0.01
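
The table above lists correlation coefficients of each numeric variable against sex. As an illustration of the underlying calculation, here is a minimal Python sketch of the Pearson coefficient. It assumes the grouping variable has been coded numerically (0/1), and the toy vectors are hypothetical, not values from the dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: a binary-coded grouping variable and two numeric columns.
sex_code = [0, 0, 1, 1, 0, 1]
abundance = [3, 5, 4, 4, 4, 4]
timepoint = [1, 2, 3, 4, 5, 6]

for name, column in [("sex", sex_code), ("abundance", abundance), ("timepoint", timepoint)]:
    print(name, round(pearson(sex_code, column), 2))
```

A coefficient near zero, as reported for abundance and timepoint, indicates no linear association with the coded variable; it does not rule out other kinds of dependence.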

Boxplot

Density histogram

Predictive model performance

Gain and lift performance curve

Dataset: heart_disease from funModeling

   Population   Gain Lift Score.Point
1          10  20.86 2.09   0.8185793
2          20  35.97 1.80   0.6967124
3          30  48.92 1.63   0.5657817
4          40  61.15 1.53   0.4901940
5          50  69.06 1.38   0.4033640
6          60  78.42 1.31   0.3344170
7          70  87.77 1.25   0.2939878
8          80  92.09 1.15   0.2473671
9          90  96.40 1.07   0.1980453
10        100 100.00 1.00   0.1195511
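
In the table above, lift is the cumulative gain divided by the population share (for example, 20.86 / 10 ≈ 2.09 in the first row). The following Python sketch shows one way to build such a table from predicted scores and true labels. It is an illustration under stated assumptions, not the funModeling implementation: `gain_lift_table` is a hypothetical name, the toy scores are invented, and the score point is taken to be the lowest score still inside each cumulative bucket.

```python
def gain_lift_table(scores, labels, n_bins=10):
    """Cumulative gain and lift per population bucket.

    scores: predicted probabilities of the positive class.
    labels: 1 for a positive case, 0 otherwise.
    Returns rows of (population_pct, gain_pct, lift, score_point).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank cases from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rows = []
    for k in range(1, n_bins + 1):
        cutoff = max(1, round(n * k / n_bins))  # cases in the top k buckets
        top = ranked[:cutoff]
        population = 100 * k / n_bins
        gain = 100 * sum(label for _, label in top) / total_pos
        lift = gain / population
        score_point = top[-1][0]  # assumed definition: lowest score in the bucket
        rows.append((round(population), round(gain, 2), round(lift, 2), score_point))
    return rows


# Hypothetical toy predictions (not the heart_disease model output).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

table = gain_lift_table(scores, labels, n_bins=5)
for row in table:
    print(row)
```

A lift above 1 in the early buckets means the model concentrates positive cases among its highest scores; by construction the last row always reaches a gain of 100 and a lift of 1.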

Coordinate plot: Profile mean clusters

       cyl  mpg  disp    hp  drat    wt  qsec vs am gear carb
1        4 26.0 108.0  91.0 4.080 2.200 18.90  1  1    4  2.0
2        6 19.7 167.6 110.0 3.900 3.210 18.30  1  0    4  4.0
3        8 15.2 350.5 192.5 3.120 3.760 17.18  0  0    3  3.5
4 All_Data 19.2 196.3 123.0 3.695 3.325 17.71  0  0    4  2.0
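
The table above gives one summary value per variable for each cluster, plus an All_Data row for the whole dataset. A minimal Python sketch of that grouping step follows. The helper name `profile_by_group` and the toy records are hypothetical. The sketch defaults to the median, which appears consistent with the values shown; that is an observation rather than something the report states, and any other summary function can be passed instead.

```python
from collections import defaultdict
from statistics import median


def profile_by_group(rows, group_key, summarise=median):
    """Summarise every non-group column per group and for all rows together."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_key]].append(row)
    grouped["All_Data"] = list(rows)  # overall profile, mirroring the table above

    columns = [name for name in rows[0] if name != group_key]
    return {
        group: {col: summarise([m[col] for m in members]) for col in columns}
        for group, members in grouped.items()
    }


# Hypothetical toy records, not the actual data behind the table.
cars = [
    {"cyl": 4, "mpg": 30.0, "hp": 70},
    {"cyl": 4, "mpg": 26.0, "hp": 90},
    {"cyl": 4, "mpg": 24.0, "hp": 95},
    {"cyl": 8, "mpg": 15.0, "hp": 180},
    {"cyl": 8, "mpg": 16.0, "hp": 200},
]

for group, profile in profile_by_group(cars, "cyl").items():
    print(group, profile)
```

Passing `summarise=statistics.mean` would give mean profiles instead.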

Snakemake workflow

Heatmaps

Using qiime2R

Using ggplot

Bray-Curtis

Jaccard

Bray-Curtis and Jaccard
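
For reference, the two dissimilarities named in these headings can be computed directly from abundance vectors. The Python sketch below is illustrative only and is not the code behind the heatmaps. It assumes Jaccard is applied to presence/absence data (some packages default to a quantitative variant), and the sample vectors are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        raise ValueError("both samples are empty")
    return 1 - len(present_u & present_v) / len(union)


# Hypothetical abundance vectors over the same four taxa.
samples = {
    "s1": [10, 0, 5, 5],
    "s2": [8, 2, 0, 10],
    "s3": [10, 0, 5, 5],
}

names = list(samples)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b,
              round(bray_curtis(samples[a], samples[b]), 2),
              round(jaccard(samples[a], samples[b]), 2))
```

Bray-Curtis responds to differences in abundance, while the presence/absence Jaccard distance only reflects which taxa are shared, so the two can rank sample pairs differently.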

Using microViz

Heatmap without sample annotation

Heatmap with sample annotation



Jitter plots



Line plots



PCoA ordination




References

[1] Buza, T. M., Tonui, T., Stomeo, F., Tiambo, C., Katani, R., Schilling, M., … Kapur, V. (2019). iMAP: An integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics, 20. https://doi.org/10.1186/S12859-019-2965-4



Appendix

Project main tree

.
├── LICENSE
├── README.md
├── Rplots.pdf
├── config
│   └── config.yml
├── css
├── dags
│   ├── rulegraph.png
│   └── rulegraph.svg
├── data
│   ├── feature_table.qza
│   ├── features.csv
│   ├── metadata.csv
│   ├── processed_data.rda
│   ├── processed_objects.rda
│   ├── rooted_tree.qza
│   ├── sample_metadata.tsv
│   ├── shannon.csv
│   ├── shannon_vector.qza
│   ├── taxonomy.csv
│   ├── taxonomy.qza
│   └── unweighted_unifrac_pcoa.qza
├── figures
│   ├── bmi_group.jpeg
│   ├── boxplot-1.png
│   ├── boxplot-2.png
│   ├── bray-1.png
│   ├── bray_jcard-1.png
│   ├── coord-1.png
│   ├── freq_catvars-1.png
│   ├── freq_catvars-2.png
│   ├── freq_catvars-3.png
│   ├── freq_catvars-4.png
│   ├── ggplot_heatmap.png
│   ├── ggplot_heatmap.svg
│   ├── glm-1.png
│   ├── group.jpeg
│   ├── histdens-1.png
│   ├── histdens-2.png
│   ├── jaccard-1.png
│   ├── microviz_heatmap.png
│   ├── microviz_heatmap.svg
│   ├── microvizhtmp-1.png
│   ├── micrvizannothtmp-1.png
│   ├── nationality.jpeg
│   ├── q2r_barplot.png
│   ├── q2r_barplot.svg
│   ├── q2r_heatmap.png
│   ├── q2r_heatmap.svg
│   ├── q2r_jitterplot.png
│   ├── q2r_jitterplot.svg
│   ├── q2r_lineplot.png
│   ├── q2r_lineplot.svg
│   ├── q2r_pcoa.png
│   ├── q2r_pcoa.svg
│   ├── sex.jpeg
│   ├── taxahtmp-1.png
│   ├── unnamed-chunk-1-1.png
│   ├── unnamed-chunk-2-1.png
│   ├── unnamed-chunk-3-1.png
│   ├── unnamed-chunk-4-1.png
│   ├── unnamed-chunk-5-1.png
│   ├── unnamed-chunk-6-1.png
│   └── unnamed-chunk-7-1.png
├── images
│   ├── bkgd.png
│   ├── coders.png
│   ├── ml.png
│   ├── smkreport
│   │   └── screenshot.png
│   └── vizcover.png
├── imap-data-visualization.Rproj
├── index.Rmd
├── library
│   ├── apa.csl
│   ├── export.bib
│   ├── imap.bib
│   ├── packages.bib
│   └── references.bib
├── report.html
├── results
│   └── project_tree.txt
├── styles.css
└── workflow
    ├── Snakefile
    ├── envs
    │   └── environment.yml
    ├── report
    │   ├── barplot.rst
    │   ├── boxplots.rst
    │   ├── hcluster.rst
    │   ├── heatmap.rst
    │   ├── jitterplot.rst
    │   ├── lineplot.rst
    │   ├── nmds.rst
    │   ├── pcaordi.rst
    │   ├── pcoa.rst
    │   ├── scatter.rst
    │   ├── venndiagram.rst
    │   ├── volcano.rst
    │   └── workflow.rst
    ├── rules
    │   ├── barplot.smk
    │   ├── gh_pages.smk
    │   ├── heatmap.smk
    │   ├── jitter.smk
    │   ├── lineplot.smk
    │   ├── pcoa.smk
    │   ├── processed_data.smk
    │   ├── project_tree.smk
    │   ├── rulegraph.smk
    │   ├── smk_report.smk
    │   ├── venn.smk
    │   └── visual_types.smk
    └── scripts
        ├── README.md
        ├── alpha.R
        ├── barplot.R
        ├── build_bs4_book.bash
        ├── common.R
        ├── custom_theme.R
        ├── heatmap.R
        ├── jitterplot.R
        ├── lefse.R
        ├── lineplot.R
        ├── pcoa.R
        ├── processed_data.R
        ├── qiime2R.R
        ├── qiime2csv.R
        ├── rarefy.R
        ├── read_matrix.R
        ├── render.R
        ├── rules_dag.sh
        ├── smk_html_report.sh
        ├── tree.sh
        └── volcanoplot.R

14 directories, 122 files



Troubleshooting and FAQs

  1. Question
    • Answer
  2. Question
    • Answer